- Title
- Boosting Exploration in Actor-Critic Algorithms by Incentivizing Plausible Novel States
- Creator
- Banerjee, Chayan; Chen, Zhiyong; Noman, Nasimul
- Relation
- 2023 62nd IEEE Conference on Decision and Control (CDC). Proceedings of the 2023 62nd IEEE Conference on Decision and Control (CDC) (Singapore, 13-15 December 2023), p. 7009-7014
- Publisher Link
- http://dx.doi.org/10.1109/CDC49753.2023.10383350
- Publisher
- Institute of Electrical and Electronics Engineers (IEEE)
- Resource Type
- conference paper
- Date
- 2023
- Description
- Improving exploration and exploitation through more efficient use of samples is a critical issue in reinforcement learning algorithms. A basic strategy is to facilitate exploration of the entire environment state space while encouraging visits to rarely visited states over frequently visited ones. Under this strategy, we propose a new method to boost exploration through an intrinsic reward, based on measuring a state's novelty and the associated benefit of exploring that state, collectively called plausible novelty. By incentivizing exploration of plausibly novel states, an actor-critic (AC) algorithm can improve its sample efficiency and, consequently, its training performance. The new method is verified through extensive simulations of continuous control tasks in MuJoCo environments, using a variety of prominent off-policy AC algorithms.
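- The description above outlines the general idea of augmenting an off-policy actor-critic algorithm with an intrinsic reward for novel states. Below is a minimal, generic sketch of intrinsic-reward shaping in Python; the count-based novelty estimate, the `beta` weight, and the `IntrinsicRewardShaper` class are illustrative assumptions and do not reproduce the plausible-novelty measure defined in the paper.

```python
# Minimal sketch (not the paper's exact formulation): shape the reward of an
# off-policy actor-critic transition with an intrinsic bonus so that rarely
# visited states earn extra reward before the transition enters the replay buffer.
import numpy as np


class IntrinsicRewardShaper:
    def __init__(self, beta: float = 0.1):
        self.beta = beta          # weight of the intrinsic bonus (assumed hyperparameter)
        self.visit_counts = {}    # toy count-based novelty estimate over discretized states

    def novelty(self, state: np.ndarray) -> float:
        # Count-based novelty: states visited less often score higher.
        key = tuple(np.round(state, 1))
        n = self.visit_counts.get(key, 0)
        self.visit_counts[key] = n + 1
        return 1.0 / np.sqrt(n + 1)

    def shape(self, extrinsic_reward: float, next_state: np.ndarray) -> float:
        # Total reward = extrinsic environment reward + weighted intrinsic bonus.
        return extrinsic_reward + self.beta * self.novelty(next_state)
```

- In use, the shaped reward would replace the raw environment reward when a transition is stored for an off-policy AC learner such as SAC or TD3; the paper's own method additionally weights novelty by an estimate of how beneficial exploring the state is.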
- Subject
- training; reinforcement learning; boosting; stability analysis
- Identifier
- http://hdl.handle.net/1959.13/1502678
- Identifier
- uon:55265
- Identifier
- ISBN:9798350301243
- Identifier
- ISSN:2576-2370
- Language
- eng
- Reviewed